Genetic control of flowering

ABSTRACT

The CONSTANS (CO) gene of Arabidopsis thaliana and homologues from Brassica napus are provided and are useful for influencing flowering characteristics in transgenic plants, especially the timing of flowering.

This invention relates to the genetic control of flowering in plants and the cloning and expression of genes involved therein. More particularly, the invention relates to the cloning and expression of the CONSTANS (CO) gene of Arabidopsis thaliana, and homologues from other species, including Brassica napus, and manipulation and use of the gene in plants.

Efficient flowering in plants is important, particularly when the intended product is the flower or the seed produced therefrom. One aspect of this is the timing of flowering: advancing or retarding the onset of flowering can be useful to farmers and seed producers. An understanding of the genetic mechanisms which influence flowering provides a means for altering the flowering characteristics of the target plant. Species for which flowering is important to crop production are numerous, essentially all crops which are grown from seed, with important examples being the cereals, rice and maize, probably the most agronomically important in warmer climatic zones, and wheat, barley, oats and rye in more temperate climates. Important seed products are oil seed rape and canola, sugar beet, maize, sunflower, soyabean and sorghum. Many crops which are harvested for their roots are, of course, grown annually from seed and the production of seed of any kind is very dependent upon the ability of the plant to flower, to be pollinated and to set seed. In horticulture, control of the timing of flowering is important. Horticultural plants whose flowering may be controlled include lettuce, endive and vegetable brassicas including cabbage, broccoli and cauliflower, and carnations and geraniums.

Arabidopsis thaliana is a facultative long day plant, flowering early under long days and late under short days. Because it has a small, well-characterized genome, is relatively easily transformed and regenerated and has a rapid growing cycle, Arabidopsis is an ideal model plant in which to study flowering and its control.

We have discovered that one of the genes required for this response to photoperiod is the CONSTANS or CO gene, also called FG. We have found that plants carrying mutations of this gene flower later than their wild-types under long days but at the same time under short days, and we conclude, therefore, that the CO gene product is involved in the promotion of flowering under long days.

Putterill et al, Mol. Gen. Genet. 239: 145-157 (1993) describes preliminary cloning work which involved chromosome walking with yeast artificial chromosome (YAC) libraries and isolation of 1700 kb of contiguous DNA on chromosome 5 of Arabidopsis, including a 300 kb region containing the gene CO. That work fell short of cloning and identification of the CO gene.

We have now cloned and sequenced the CO gene (Putterill et al., 1995), which is provided herein. Unexpected difficulties and complications were encountered which made the cloning harder than anticipated, as is discussed below in the experimental section.

According to a first aspect of the present invention there is provided a nucleic acid molecule comprising a nucleotide sequence encoding a polypeptide with CO function. Those skilled in the art will appreciate that “CO function” may be used to refer to the ability to influence the timing of flowering phenotypically like the CO gene of Arabidopsis thaliana (the timing being substantially unaffected by vernalisation), especially the ability to complement a co mutation in Arabidopsis thaliana, or the co phenotype in another species. CO mutants exhibit delayed flowering under long days, the timing of flowering being substantially unaffected by vernalisation (see, for example, Koornneef et al. (1991)).

Nucleic acid according to the present invention may have the sequence of a CO gene of Arabidopsis thaliana, or be a mutant, derivative or allele of the sequence provided. Preferred mutants, derivatives and alleles are those which encode a protein which retains a functional characteristic of the protein encoded by the wild-type gene, especially the ability to promote flowering as discussed herein. Other preferred mutants, derivatives and alleles encode a protein which delays flowering compared to wild-type or a gene with the sequence provided. Changes to a sequence, to produce a mutant or derivative, may be by one or more of addition, insertion, deletion or substitution of one or more nucleotides in the nucleic acid, leading to the addition, insertion, deletion or substitution of one or more amino acids in the encoded polypeptide. Of course, changes to the nucleic acid which make no difference to the encoded amino acid sequence are included.
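
By way of illustration only, the distinction between nucleotide changes that alter the encoded polypeptide and changes that make no difference to it may be sketched in Python using the standard genetic code. The short sequences below are invented for the example and are not taken from FIG. 1; the function name `translate` is likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Invented toy sequences (assumption: not derived from the CO sequence).
reference = "ATGGCTTGT"  # Met-Ala-Cys
silent = "ATGGCCTGT"     # GCT -> GCC, still alanine
missense = "ATGGCTTAT"   # TGT -> TAT, cysteine replaced by tyrosine

print(translate(reference), translate(silent), translate(missense))
```

The second sequence differs from the first at the nucleotide level only, whereas the third substitutes an amino acid in the encoded polypeptide.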

A preferred nucleic acid sequence for a CO gene is shown in FIG. 1, along with the encoded amino acid sequence of a polypeptide which has CO function.

The present invention also provides a vector which comprises nucleic acid with any one of the provided sequences, preferably a vector from which polypeptide encoded by the nucleic acid sequence can be expressed. The vector is preferably suitable for transformation into a plant cell. The invention further encompasses a host cell transformed with such a vector, especially a plant cell. Thus, a host cell, such as a plant cell, comprising nucleic acid according to the present invention is provided. Within the cell, the nucleic acid may be incorporated within the chromosome. There may be more than one heterologous nucleotide sequence per haploid genome. This, for example, enables increased expression of the gene product compared with endogenous levels, as discussed below.

A vector comprising nucleic acid according to the present invention need not include a promoter or other regulatory sequence, particularly if the vector is to be used to introduce the nucleic acid into cells for recombination into the genome.

Nucleic acid molecules and vectors according to the present invention may be provided isolated from their natural environment, in substantially pure or homogeneous form, or free or substantially free of nucleic acid or genes of the species of interest or origin other than the sequence encoding a polypeptide able to influence flowering, eg in Arabidopsis thaliana nucleic acid other than the CO sequence.

Nucleic acid may of course be double- or single-stranded, cDNA or genomic DNA, RNA, wholly or partially synthetic, as appropriate.

The present invention also encompasses the expression product of any of the nucleic acid sequences disclosed and methods of making the expression product by expression from encoding nucleic acid therefor under suitable conditions in suitable host cells. Those skilled in the art are well able to construct vectors and design protocols for expression and recovery of products of recombinant gene expression. Suitable vectors can be chosen or constructed, containing appropriate regulatory sequences, including promoter sequences, terminator fragments, polyadenylation sequences, enhancer sequences, marker genes and other sequences as appropriate. For further details see, for example, Molecular Cloning: a Laboratory Manual: 2nd edition, Sambrook et al, 1989, Cold Spring Harbor Laboratory Press. Transformation procedures depend on the host used, but are well known.

The present invention further encompasses a plant comprising a plant cell comprising nucleic acid according to the present invention, and selfed or hybrid progeny and any descendant of such a plant, also any part or propagule of such a plant, progeny or descendant, including seed.

A further aspect of the present invention provides a method of identifying and cloning CO homologues from plant species other than Arabidopsis thaliana which method employs a nucleotide sequence derived from that shown in FIG. 1. The genes whose sequences are shown in FIG. 5 and FIG. 6 were cloned in this way. Sequences derived from these may themselves be used in identifying and in cloning other sequences. The nucleotide sequence information provided herein, or any part thereof, may be used in a data-base search to find homologous sequences, expression products of which can be tested for ability to influence a flowering characteristic. These may have “CO function” or the ability to complement a mutant phenotype, which phenotype is delayed flowering (especially under long days), preferably the timing of flowering being substantially unaffected by vernalisation, as disclosed herein. Alternatively, nucleic acid libraries may be screened using techniques well known to those skilled in the art and homologous sequences thereby identified then tested.

The present invention also extends to nucleic acid encoding a CO homologue obtained using a nucleotide sequence derived from that shown in FIG. 1. CO homologue sequences are shown in FIGS. 5 and 6. Also encompassed by the invention is nucleic acid encoding a CO homologue obtained using a nucleotide sequence derived from a sequence shown in FIG. 5 or FIG. 6.

The CO protein contains an arrangement of cysteines at the amino end of the protein that is characteristic of zinc fingers, such as those contained within the GATA transcription factors (discussed by Ramain et al, 1993; Sánchez-García and Rabbitts, 1994). Seven independently isolated co mutants have been described, and we have identified the sequence changes causing a reduction in CO activity in all seven cases. Five of them have alterations within regions proposed from their sequence to form zinc fingers, and the other two have changes in adjacent amino acids at the carboxy terminus of the protein. The positions of these alterations support our interpretation that CO encodes a protein containing zinc fingers that probably binds DNA and acts as a transcription factor.

The provision of sequence information for the CO gene of Arabidopsis thaliana enables the obtention of homologous sequences from other plant species. In Southern hybridization experiments a probe containing the CO gene of Arabidopsis thaliana hybridises to DNA extracted from Brassica nigra, Brassica napus and Brassica oleracea. Different varieties of these species display restriction fragment length polymorphisms (RFLPs) when their DNA is cleaved with a restriction enzyme and hybridised to a CO probe. These RFLPs may then be used to map the CO gene relative to other RFLPs of known position. In this way, for example, three CO gene homologues were mapped to linkage groups N5, N2 and N12 of Brassica napus (D. Lydiate, unpublished). The populations used for RFLP mapping had previously been scored for flowering time and it was demonstrated that particular alleles of the CO homologues segregated together with allelic variations affecting flowering time. The loci mapped to linkage groups N2 and N12 showed the most extreme allelic variation for flowering time.

Successful cloning of two Brassica napus homologues is described in Example 5.

This confirms that genes homologous to the CO gene of Arabidopsis regulate flowering time in other plant species.

Thus, included within the scope of the present invention are nucleic acid molecules which encode amino acid sequences which are homologues of CO of Arabidopsis thaliana. Homology may be at the nucleotide sequence and/or amino acid sequence level. Preferably, the nucleic acid sequence shares homology with the sequence encoded by the nucleotide sequence of FIG. 1, preferably at least about 50%, or 60%, or 70%, or 80% homology, most preferably at least 90% homology, from species other than Arabidopsis thaliana and the encoded polypeptide shares a phenotype with the Arabidopsis thaliana CO gene, preferably the ability to influence timing of flowering. These may promote or delay flowering compared with Arabidopsis thaliana CO and mutants, derivatives or alleles may promote or delay flowering compared with wild-type.
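
As a minimal sketch of how such a percentage might be assessed in practice, the following Python function computes percent identity over two sequences that have already been aligned. It rests on assumptions not made in the text above: that homology is approximated by identity, that gaps are written as `-`, and that columns containing a gap are excluded from the count. Other definitions are in common use and give different figures. The peptide strings are invented and are not taken from FIG. 1, FIG. 5 or FIG. 6.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Invented example peptides (assumption: purely illustrative).
query = "MLKQES-CDT"
candidate = "MLRQESACDS"
score = percent_identity(query, candidate)
print(f"{score:.1f}% identity")

# Thresholds of the kind recited above can then be applied directly.
print(score >= 70.0)
```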

CO gene homologues may also be identified from economically important monocotyledonous crop plants such as rice and maize. Although genes encoding the same protein in monocotyledonous and dicotyledonous plants show relatively little homology at the nucleotide level, amino acid sequences are conserved. In public sequence databases we recently identified several Arabidopsis cDNA clone sequences that were obtained in random sequencing programmes and share homology with CO in regions of the protein that are known to be important for its activity. Similarly, among randomly sequenced rice cDNAs we identified one clone that shared relatively little homology to CO at the DNA level but showed high homology at the amino acid level. This clone, and another one that we have identified from maize, may be used to identify the whole CO gene family from rice and other cereals. By sequencing each of these clones, studying their expression patterns and examining the effect of altering their expression, genes carrying out a similar function to CO in regulating flowering time are obtainable. Of course, mutants, derivatives and alleles of these sequences are included within the scope of the present invention in the same terms as discussed above for the Arabidopsis thaliana CO gene.

Nucleic acid according to the invention may comprise a nucleotide sequence encoding a polypeptide able to complement a mutant phenotype which is delayed flowering, the timing of flowering being substantially unaffected by vernalisation. The delayed flowering may be under long days. Also the present invention provides nucleic acid comprising a nucleotide sequence which is a mutant or derivative of a wild-type gene encoding a polypeptide with ability to influence the timing of flowering, the mutant or derivative phenotype being early or delayed flowering with the timing of flowering being substantially unaffected by vernalisation. These are distinguished from the LD gene reported by Lee et al.

Vernalisation is low-temperature (usually just above 0° C.) treatment of plants (seedlings) or seed for a period of usually a few weeks, probably about 30 days. It is a treatment required by some plant species before they will break bud or flower, simulating the effect of winter cold.

Also according to the invention there is provided a plant cell having incorporated into its genome a sequence of nucleotides as provided by the present invention, under operative control of a regulatory sequence for control of expression. A further aspect of the present invention provides a method of making such a plant cell involving introduction of a vector comprising the sequence of nucleotides into a plant cell and causing or allowing recombination between the vector and the plant cell genome to introduce the sequence of nucleotides into the genome.

Plants which comprise a plant cell according to the invention are also provided, along with any part or propagule thereof, seed, selfed or hybrid progeny and descendants.

The invention further provides a method of influencing the flowering characteristics of a plant comprising expression of a heterologous CO gene sequence (or mutant, allele, derivative or homologue thereof, as discussed) within cells of the plant. The term “heterologous” indicates that the gene/sequence of nucleotides in question has been introduced into said cells of the plant using genetic engineering, ie by human intervention. The gene may be on an extra-genomic vector or incorporated, preferably stably, into the genome. The heterologous gene may replace an endogenous equivalent gene, ie one which normally performs the same or a similar function in control of flowering, or the inserted sequence may be additional to the endogenous gene. An advantage of introduction of a heterologous gene is the ability to place expression of the gene under the control of a promoter of choice, in order to be able to influence gene expression, and therefore flowering, according to preference. Furthermore, mutants and derivatives of the wild-type gene, eg with higher or lower activity than wild-type, may be used in place of the endogenous gene.

The principal flowering characteristic which may be altered using the present invention is the timing of flowering. Under-expression of the gene product of the CO gene leads to delayed flowering (as suggested by the co mutant phenotype); over-expression may lead to precocious flowering (as demonstrated with transgenic Arabidopsis plants carrying extra copies of the CO gene and by expression from CaMV 35S promoter). This degree of control is useful to ensure synchronous flowering of male and female parent lines in hybrid production, for example. Another use is to advance or retard the flowering in accordance with the dictates of the climate so as to extend or reduce the growing season. This may involve use of anti-sense or sense regulation.

A second flowering characteristic that may be altered is the distribution of flowers on the shoot. In Arabidopsis, flowers develop on the sides but not at the apex of the shoot. This is determined by the location of expression of the LEAFY gene (Weigel et al., 1992), and mutations such as terminal flower (Shannon and Meeks-Wagner, 1991) that cause LEAFY to be expressed in the apex of the shoot also lead to flowers developing at the apex. There is evidence that CO is required for full activity of LEAFY (Putterill et al., 1995), and therefore by increasing or altering the pattern of CO expression the level and positions of LEAFY expression, and therefore of flower development, may also be altered. This is exemplified in Example 4. This may be employed advantageously in creating new varieties of horticultural species with altered arrangements of flowers.

The nucleic acid according to the invention, such as a CO gene or homologue, may be placed under the control of an externally inducible gene promoter to place the timing of flowering under the control of the user. The use of an inducible promoter is described below. This is advantageous in that flower production, and subsequent events such as seed set, may be timed to meet market demands, for example, in cut flowers or decorative flowering pot plants. Delaying flowering in pot plants is advantageous to lengthen the period available for transport of the product from the producer to the point of sale and lengthening of the flowering period is an obvious advantage to the purchaser.

The term “inducible” as applied to a promoter is well understood by those skilled in the art. In essence, expression under the control of an inducible promoter is “switched on” or increased in response to an applied stimulus. The nature of the stimulus varies between promoters. Some inducible promoters cause little or undetectable levels of expression (or no expression) in the absence of the appropriate stimulus. Other inducible promoters cause detectable constitutive expression in the absence of the stimulus. Whatever the level of expression is in the absence of the stimulus, expression from any inducible promoter is increased in the presence of the correct stimulus. The preferable situation is where the level of expression increases upon application of the relevant stimulus by an amount effective to alter a phenotypic characteristic. Thus an inducible (or “switchable”) promoter may be used which causes a basic level of expression in the absence of the stimulus which level is too low to bring about a desired phenotype (and may in fact be zero). Upon application of the stimulus, expression is increased (or switched on) to a level which brings about the desired phenotype.

Suitable promoters include the Cauliflower Mosaic Virus 35S (CaMV 35S) gene promoter that is expressed at a high level in virtually all plant tissues (Benfey et al, 1990a and 1990b); the maize glutathione-S-transferase isoform II (GST-II-27) gene promoter which is activated in response to application of exogenous safener (WO93/01294, ICI Ltd); the cauliflower meri 5 promoter that is expressed in the vegetative apical meristem as well as several well localised positions in the plant body, eg inner phloem, flower primordia, branching points in root and shoot (Medford, 1992; Medford et al, 1991); and the Arabidopsis thaliana LEAFY promoter that is expressed very early in flower development (Weigel et al, 1992).

When introducing a chosen gene construct into a cell, certain considerations must be taken into account, well known to those skilled in the art. The nucleic acid to be inserted should be assembled within a construct which contains effective regulatory elements which will drive transcription. There must be available a method of transporting the construct into the cell. Once the construct is within the cell membrane, integration into the endogenous chromosomal material either will or will not occur. Finally, as far as plants are concerned the target cell type must be such that cells can be regenerated into whole plants.

Plants transformed with a DNA segment containing the sequence may be produced by standard techniques for the genetic manipulation of plants. DNA can be transformed into plant cells using any suitable technology, such as a disarmed Ti-plasmid vector carried by Agrobacterium exploiting its natural gene transfer ability (EP-A-270355, EP-A-0116718, NAR 12(22) 8711-87215 1984), particle or microprojectile bombardment (U.S. Pat. No. 5,100,792, EP-A-444882, EP-A-434616), microinjection (WO 92/09696, WO 94/00583, EP 331083, EP 175966), electroporation (EP 290395, WO 8706614) or other forms of direct DNA uptake (DE 4005152, WO 9012096, U.S. Pat. No. 4,684,611). Agrobacterium transformation is widely used by those skilled in the art to transform dicotyledonous species. Although Agrobacterium has been reported to be able to transform foreign DNA into some monocotyledonous species (WO 92/14828), microprojectile bombardment, electroporation and direct DNA uptake are preferred where Agrobacterium is inefficient or ineffective. Alternatively, a combination of different techniques may be employed to enhance the efficiency of the transformation process, eg bombardment with Agrobacterium coated microparticles (EP-A-486234) or microprojectile bombardment to induce wounding followed by co-cultivation with Agrobacterium (EP-A-486233).

The particular choice of a transformation technology will be determined by its efficiency to transform certain plant species as well as the experience and preference of the person practising the invention with a particular methodology of choice. It will be apparent to the skilled person that the particular choice of a transformation system to introduce nucleic acid into plant cells is not essential to or a limitation of the invention.

In the present invention, over-expression may be achieved by introduction of the nucleotide sequence in a sense orientation. Thus, the present invention provides a method of influencing a flowering characteristic of a plant, the method comprising causing or allowing expression of the polypeptide encoded by the nucleotide sequence of nucleic acid according to the invention from that nucleic acid within cells of the plant. (See Example 4.)

Under-expression of the gene product polypeptide may be achieved using anti-sense technology or “sense regulation”. The use of anti-sense genes or partial gene sequences to down-regulate gene expression is now well-established. DNA is placed under the control of a promoter such that transcription of the “anti-sense” strand of the DNA yields RNA which is complementary to normal mRNA transcribed from the “sense” strand of the target gene. For double-stranded DNA this is achieved by placing a coding sequence or a fragment thereof in a “reverse orientation” under the control of a promoter. The complementary anti-sense RNA sequence is thought then to bind with mRNA to form a duplex, inhibiting translation of the endogenous mRNA from the target gene into protein. Whether or not this is the actual mode of action is still uncertain. However, it is established fact that the technique works. See, for example, Rothstein et al, 1987; Smith et al, 1988; Zhang et al, 1992.
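
The sequence relationship described above can be sketched in Python as follows. This is an illustration of complementarity only, under the assumption of simple Watson-Crick pairing, and makes no claim about the mode of action in the plant cell. The six-base fragment and the function names are hypothetical and are not taken from FIG. 1.

```python
# Complement map for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Watson-Crick RNA base pairs.
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """RNA with the same sequence as the given DNA strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(sense_dna: str) -> str:
    """RNA obtained when the coding fragment is placed in reverse orientation."""
    return transcribe(reverse_complement(sense_dna))


def forms_duplex(mrna: str, antisense: str) -> bool:
    """True if the two RNAs pair base-for-base in antiparallel orientation."""
    return len(mrna) == len(antisense) and all(
        pair in RNA_PAIRS for pair in zip(mrna, reversed(antisense))
    )


sense_fragment = "ATGGCT"  # invented fragment (assumption)
mrna = transcribe(sense_fragment)
anti = antisense_rna(sense_fragment)
print(mrna, anti, forms_duplex(mrna, anti))
```

For the invented fragment the mRNA reads AUGGCU and the anti-sense transcript AGCCAU, which pair along their full length when aligned antiparallel.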

Thus, the present invention also provides a method of influencing a flowering characteristic of a plant, the method comprising causing or allowing anti-sense transcription from nucleic acid according to the invention within cells of the plant.

When additional copies of the target gene are inserted in sense, that is the same, orientation as the target gene, a range of phenotypes is produced which includes individuals where over-expression occurs and some where under-expression of protein from the target gene occurs. When the inserted gene is only part of the endogenous gene the number of under-expressing individuals in the transgenic population increases. The mechanism by which sense regulation occurs, particularly down-regulation, is not well-understood. However, this technique is also well-reported in scientific and patent literature and is used routinely for gene control. See, for example, van der Krol, 1990; Napoli et al, 1990; Zhang et al, 1992.

Thus, the present invention also provides a method of influencing a flowering characteristic of a plant, the method comprising causing or allowing expression from nucleic acid according to the invention within cells of the plant. This may be used to suppress activity of a polypeptide with ability to influence a flowering characteristic. Here the activity of the polypeptide is preferably suppressed as a result of under-expression within the plant cells.

As stated above, the expression pattern of the CO gene may be altered by fusing it to a foreign promoter. For example, International patent application WO93/01294 of Imperial Chemical Industries Limited describes a chemically inducible gene promoter sequence isolated from a 27 kD subunit of the maize glutathione-S-transferase, isoform II gene (GST-II-27) (see FIG. 2). It has been found that when linked to an exogenous gene and introduced into a plant by transformation, the GST-II-27 promoter provides a means for the external regulation of the expression of that exogenous gene. The structural region of the CO gene is fused to the GST-II-27 promoter downstream of the translation start point shown in FIG. 2.

The GST-II-27 gene promoter has been shown to be induced by certain chemical compounds which can be applied to growing plants. The promoter is functional in both monocotyledons and dicotyledons. It can therefore be used to control gene expression in a variety of genetically modified plants, including field crops such as canola, sunflower, tobacco, sugarbeet, cotton; cereals such as wheat, barley, rice, maize, sorghum; fruit such as tomatoes, mangoes, peaches, apples, pears, strawberries, bananas, and melons; and vegetables such as carrot, lettuce, cabbage and onion. The GST-II-27 promoter is also suitable for use in a variety of tissues, including roots, leaves, stems and reproductive tissues.

Accordingly, the present invention provides in a further aspect a gene construct comprising an inducible promoter operatively linked to a nucleotide sequence provided by the present invention, such as the CO gene of Arabidopsis thaliana, a homologue from another plant species or any mutant, derivative or allele thereof. This enables control of expression of the gene. The invention also provides plants transformed with said gene construct and methods comprising introduction of such a construct into a plant cell and/or induction of expression of a construct within a plant cell, by application of a suitable stimulus, an effective exogenous inducer. The promoter may be the GST-II-27 gene promoter or any other inducible plant promoter.

Promotion of CO Activity to Cause Early Flowering

Mutations that reduce CO activity cause late flowering under inductive long day conditions, indicating CO involvement in promoting flowering under long days. It is probably not required under non-inductive short days because co mutations have no effect on flowering time under these conditions. The CO transcript is present at very low abundance under long days and has only been detected by using PCR to amplify cDNA. The observation that some transgenic plants harbouring a T-DNA containing CO flowered slightly earlier than wild type under long days and considerably earlier than wild type under short days suggests that, particularly under non-inductive short days, the level of the CO transcript is limiting on flowering time. This suggests that flowering could be manipulated by using foreign promoters to alter the expression of the gene:

Causing Early Flowering under Non-Inductive Conditions

Manipulation of CO transcript levels under non-inductive conditions may lead to early, or regulated, flowering. Promoter fusions such as those disclosed herein enable expression of CO mRNA at a higher level than that found in wild-type plants under non-inductive conditions. Use of CaMV35S or meri 5 fusions leads to early flowering while use of GSTII fusions leads to regulated flowering.

Causing Early Flowering under Inductive Conditions

Wild-type Arabidopsis plants flower extremely quickly under inductive conditions and the CO gene is expressed prior to flowering, although at a low level. Nevertheless, some transgenic wild-type plants containing extra copies of CO have been shown to flower slightly earlier than wild-type plants. The level of the CO product may be increased by introduction of promoter, eg CaMV35S or meri 5, fusions. Inducible promoters, such as GSTII, may be used to regulate flowering, eg by first creating a CO mutant of a particular species and then introducing an inducible promoter-CO fusion capable of complementation of the mutation in a regulated fashion.

Inhibition of CO Activity to Cause Late Flowering

Mutations in CO cause late flowering of Arabidopsis. Transgenic approaches may be used to reduce CO activity and thereby delay or prevent flowering in a range of plant species. A variety of strategies may be employed.

Expression of Sense or Anti-Sense RNAs

In several cases the activity of endogenous plant genes has been reduced by the expression of homologous antisense RNA from a transgene, as discussed above. Similarly, the expression of sense transcripts from a transgene may reduce the activity of the corresponding endogenous copy of the gene, as discussed above. Expression of a CO antisense or sense RNA should reduce activity of the endogenous gene and cause late flowering.

Expression of Modified Versions of the CO Protein

Transcription factors and other DNA binding proteins often have a modular structure in which amino acid sequences required for DNA binding, dimerisation or transcriptional activation are encoded by separate domains of the protein (reviewed by Ptashne and Gann, 1990). This permits the construction of truncated or fusion proteins that display only one of the functions of the DNA binding protein. In the case of CO, modification of the gene in vitro and expression of modified versions of the protein may lead to dominant inhibition of the endogenous, intact protein and thereby delay flowering. This may be accomplished in various ways, including the following:

Expression of a Truncated CO Protein Encoding Only the DNA Binding Region

The zinc-finger containing region of CO may be required and sufficient to permit binding to DNA. If a truncated or mutated protein that only encodes the DNA binding region were expressed at a higher level than the endogenous protein, then most of the CO binding sites should be occupied by the mutated version thereby preventing binding of the fully active endogenous protein. Binding of the mutant protein would have the effect of preventing CO action, because the mutated protein would not contain any other regions of CO that might be involved in biological processes such as transcriptional activation, transcriptional inhibition or protein-protein interaction.

In vitro analysis of a murine transcription factor GF1 that contains zinc-fingers similar to those of CO suggests that a truncated CO protein with the properties described above could be designed. Martin and Orkin (1990) demonstrated that a truncated version of GF1 containing only the zinc fingers retained DNA binding activity, but was incapable of transcriptional activation. Similarly, the zinc-finger containing PANNIER protein of Drosophila melanogaster is required to repress activation of genes required for bristle formation. Mutations in a domain that does not contain the zinc fingers caused dominant super-repression of gene activity, probably because these proteins bind DNA but do not interact with other proteins in the way that the wild-type protein does (Ramain et al, 1993).

Expression of a Mutant CO Protein Not Encoding the DNA Binding Domain

A second form of inhibitory molecule may be designed if CO must dimerise, or form complexes with other proteins, to have its biological effect, and if these complexes can form without a requirement for CO being bound to DNA. In this case expression of a CO protein that is mutated within the DNA-binding domain, but contains all of the other properties of the wild-type protein, would have an inhibitory effect. If the mutant protein were present at a higher concentration than the endogenous protein and CO normally forms dimers, then most of the endogenous protein would form dimers with the mutant protein and would not bind DNA. Similarly, if CO forms complexes with other proteins, then the mutant form of CO would participate in the majority of these complexes which would then not bind DNA.

Mutant forms of DNA-binding proteins with these properties have been reported previously. For example, in yeast cells expression of a protein containing the transcriptional activation domain of GAL4 was able to reduce the expression of the CYC1 gene. CYC1 is not normally activated by GAL4, so it was proposed that the GAL4 activating domain sequesters proteins required for CYC1 activation (Gill and Ptashne, 1988). Similarly, mutations in the zinc finger region of the PANNIER protein of Drosophila melanogaster have a dominant phenotype, probably because the mutant proteins sequester proteins essential for PANNIER activity and reduce their availability to interact with wild-type protein (Ramain et al, 1993).

Aspects and embodiments of the present invention will now be illustrated, by way of example, with reference to the accompanying figures. Further aspects and embodiments will be apparent to those skilled in the art. All documents mentioned in this text are incorporated herein by reference.

IN THE FIGURES

FIG. 1 shows a nucleotide sequence according to one embodiment of the invention, being the sequence of the CO ORF obtained from Arabidopsis thaliana (SEQ ID NO. 1), and the predicted amino acid sequence (SEQ ID NO. 2). The nucleotide sequence is shown above the amino acid sequence. The region shown in bold is thought to encompass both zinc finger domains.

FIG. 2 shows the nucleotide sequence of the GST-II-27 gene promoter (SEQ ID NO. 3). The fragment used to make the fusion was flanked by the HindIII and NdeI sites that are shown in bold.

FIG. 3 shows the nucleotide sequence of the genomic DNA comprising the CO gene obtained from Arabidopsis thaliana, including the single intron, promoter sequences and sequences present after the translational termination codon (SEQ ID NO. 4). The genomic region shown starts 2674 bp upstream of the translational start site, and ends just after the polyadenylation site. The CO open reading frame is shown in bold, and is interrupted by the single intron.

FIG. 4 shows the pJIT62 plasmid used as a source of the CaMV 35S promoter. The KpnI-HindIII fragment, shown as a dark coloured thick line, was used as a source of the promoter.

FIG. 5 shows a nucleotide sequence according to a further embodiment of the invention, being a CO ORF obtained from Brassica napus (SEQ ID NO. 5), and the predicted amino acid sequence (SEQ ID NO. 6).

FIG. 6 shows a nucleotide sequence according to a further embodiment of the invention, being a second CO ORF obtained from Brassica napus (SEQ ID NO. 7), and the predicted amino acid sequence (SEQ ID NO. 8).

EXAMPLE 1 Cloning and Analysis of a CO Gene

Cosmid and RFLP Markers.

DNA of λ CHS2 was obtained from R. Feinbaum (Massachusetts General Hospital (MGH), Boston). Total DNA was used as radiolabelled probe to YAC library colony filters and plant genomic DNA blots. Cosmids g6833, 17085, 17861, 19027, 16431, 14534, g5962 and g4568 were obtained from Brian Hauge (MGH, Boston), cultured in the presence of 30 μg/ml kanamycin, and maintained as glycerol stocks at −70° C. Total cosmid DNA was used as radiolabelled probe to YAC library colony filters and plant genomic DNA blots. Cosmid pCIT1243 was provided by Elliot Meyerowitz (Caltech, Pasadena), cultured in the presence of 100 μg/ml streptomycin/spectinomycin and maintained as a glycerol stock at −70° C. pCIT30 vector sequences share homology to pYAC4 derived vectors, and therefore YAC library colony filters were hybridised with insert DNA extracted from the cosmid. Total DNA of pCIT1243 was used as radiolabelled probe to plant genomic DNA blots.

YAC Libraries.

The EG, abi and S libraries were obtained from Chris Somerville (Michigan State University). The EW library was obtained from Jeff Dangl (Max Delbruck Laboratory, Cologne) and the Yup library from Joe Ecker (University of Pennsylvania). Master copies of the libraries were stored at −70° C. (as described by Schmidt et al. Aust. J. Plant Physiol. 19: 341-351 (1992)). The working stocks were maintained on selective Kiwibrew agar at 4° C. Kiwibrew is a selective, complete minimal medium minus uracil, and containing 11% Casamino acids. Working stocks of the libraries were replated using a 96-prong replicator every 3 months.

Yeast Colony Filters.

Hybond-N (Amersham) filters (8 cm×11 cm) containing arrays of yeast colony DNA from 8-24 library plates were produced and processed as described by Coulson et al. Nature 335:184-186 (1988), and modified as described by Schmidt and Dean, Genome Analysis, vol. 4: 71-98 (1992). Hybridisation and washing conditions were according to the manufacturer's instructions. Radiolabelled probe DNA was prepared by random-hexamer labelling.

Yeast Chromosome Preparation and Fractionation by Pulsed Field Gel Electrophoresis (PFGE).

Five millilitres of Kiwibrew was inoculated with a single yeast colony and cultured at 30° C. for 24 h. Yeast spheroplasts were generated by incubation with 2.5 mg/ml Novozym (Novo Biolabs) for 1 h at room temperature. Then 1 M sorbitol was added to bring the final volume of spheroplasts to 50 μl. Eighty microlitres of molten LMP agarose (1% InCert agarose, FMC) in 1 M sorbitol was added to the spheroplasts, the mixture was vortexed briefly and pipetted into plug moulds. Plugs were placed into 1.5 ml Eppendorf tubes and then incubated in 1 ml of 1 mg/ml Proteinase K (Boehringer Mannheim) in 100 mM EDTA, pH 8, 1% Sarkosyl for 4 h at 50° C. The solution was replaced and the plugs incubated overnight. The plugs were washed three times for 30 min each with TE and twice for 30 min with 0.5×TBE. PFGE was carried out using the Pulsaphor system (LKB). One-third of a plug was loaded onto a 1% agarose gel and electrophoresed in 0.5×TBE at 170 V, 20 s pulse time, for 36 h at 4° C. DNA markers were concatemers of λ DNA prepared as described by Bancroft and Wolk, Nucleic Acids Res. 16:7405-7418 (1988). DNA was visualised by staining with ethidium bromide.

Yeast Genomic DNA for Restriction Enzyme Digestion and Inverse Polymerase Chain Reaction (IPCR).

Yeast genomic DNA was prepared essentially as described by Heard et al. (1989) except that yeast spheroplasts were prepared as above. Finally, the DNA was extracted twice with phenol/chloroform, once with chloroform and ethanol precipitated. The yield from a 5 ml culture was about 10 μg DNA.

Isolation of YAC End Fragments by IPCR.

Yeast genomic DNA (100 ng) was digested with AluI, HaeIII, EcoRV or HincII. The digestions were phenol-chloroform extracted once and then ethanol precipitated. The DNA fragments were circularised by ligation in a volume of 100 μl overnight at 16° C. in the presence of 2 U ligase (BRL). After incubation of the ligation mixture at 65° C. for 10 min, IPCR was carried out on 10 μl ligation mixture using inverse primer pairs. The IPCR conditions and C and D primer pairs have been described by Schmidt et al. (1992). The JP series are from M. Hirst (IMM Molecular Genetics Group, Oxford).

After digestion with the indicated enzymes, the following primer pairs were used:

For left-end IPCR: AluI, EcoRV; D71 5′tcctgctcgcttcgctactt3′ and C78 5′gcgatgctgtcggaatggac3′. HaeIII; JP1 5′aagtactctcggtagccaag3′ and JP5 5′gtgtggtcgccatgatcgcg3′.

For right-end IPCR: AluI, HincII; C69 5′ctgggaagtgaatggagacata3′ and C70 5′aggagtcgcataagggagag3′. HaeIII; C69 and JP4 5′ttcaagctctacgccgga3′.
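As a rough illustration of how the primers listed above might be compared, the sketch below applies the Wallace rule of thumb (2° C. per A or T, 4° C. per G or C) to estimate a melting temperature for each oligonucleotide. The rule is only an approximation that is usually applied to short oligonucleotides, the function name is chosen here for illustration, and nothing in the text states that melting temperatures were calculated in this way.

```python
# Rough melting-temperature estimate for the IPCR primers quoted in the text.
# The Wallace rule (2*(A+T) + 4*(G+C)) is an approximation and is used here
# as an assumption; it is not stated to be the method used by the authors.

def wallace_tm(primer: str) -> int:
    """Return the Wallace-rule melting temperature estimate in degrees C."""
    seq = primer.lower()
    at = seq.count("a") + seq.count("t")
    gc = seq.count("g") + seq.count("c")
    return 2 * at + 4 * gc

# Primer sequences copied from the left-end and right-end IPCR lists.
primers = {
    "D71": "tcctgctcgcttcgctactt",
    "C78": "gcgatgctgtcggaatggac",
    "JP1": "aagtactctcggtagccaag",
    "JP5": "gtgtggtcgccatgatcgcg",
    "C69": "ctgggaagtgaatggagacata",
    "C70": "aggagtcgcataagggagag",
    "JP4": "ttcaagctctacgccgga",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, estimated Tm {wallace_tm(seq)} C")
```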

Aliquots of the IPCR reactions were checked by electrophoresis on a 1.5% agarose gel and then 1 μl of the reaction was re-amplified by PCR using the conditions and F primer series recommended by I. Hwang (MGH, Boston). Conditions for re-amplification were the same as for IPCR, except that 30 cycles (1 min, 94° C.; 1 min, 45° C.; and 3 min, 72° C.) were used. The F primers anneal very near the cloning site and so reduce the amount of vector sequence present in the PCR product. In addition they introduce a FokI site very close to the destroyed cloning site of EW and S YACs.

The primers used for re-amplification of left-end IPCR products were as follows:

For EG, abi and S YACs: AluI, F2 5′acgtcggatgctcactatagggatc3′ and C77 5′gtgataaactaccgcattaaagc3′; HaeIII, F2 and JP5; EcoRV, F2 and C78.

For EW and Yup YACs: AluI, F6 5′acgtcggatgactttaatttattcacta3′ and C77; HaeIII, F6 and JP5; EcoRV, F6 and C78.

The following primers were used for re-amplification of the right-end IPCR products:

For EG, abi and S YACs: AluI, F3 5′gacgtggatgctcactaaagggatc3′ and C71 5′agagccttcaacccagtcag3′; HaeIII, F3 and JP4; HincII, F3 and C70.

For EW and Yup YACs: AluI, F7 5′acgtcggatgccgatctcaagatta3′ and C77; HaeIII, F7 and JP4; HincII, F7 and C70.

The resulting PCR product was purified by cleaving with the enzyme originally used in the digestion together with BamHI (EG and abi YACs) or EcoRI (Yup YACs) and separated on 1% LMP agarose gels. The YAC end probes were radiolabelled using random priming in molten agarose, and in appropriate cases digested with FokI to remove vector sequences and then used as hybridisation probes.

Isolation of YAC Left-End Probes by Plasmid Rescue.

Plasmid rescue of YAC left-end fragments from EG, abi and EW YACs was carried out as described by Schmidt et al. (1992).

Isolation of Plant Genomic DNA.

Plant genomic DNA was isolated from glasshouse grown plants essentially as described by Tai and Tanksley, Plant Mol. Biol. Rep. 8: 297-303 (1991), except that the tissue was ground in liquid nitrogen and the RNase step omitted. Large-scale (2.5-5 g leaves) and miniprep (3-4 leaves) DNA was prepared using this method.

Gel Blotting and Hybridisation Conditions.

Gel transfer to Hybond-N, hybridisation and washing conditions were according to the manufacturer's instructions, except that DNA was fixed to the filters by UV Stratalinker treatment (1200 μJ×100; Stratagene) and/or baked at 80° C. for 2 h. Radiolabelled DNA was prepared by random hexamer labelling.

RFLP Analysis.

Two to three micrograms of plant genomic DNA was prepared from the parental plants used in the crosses and cleaved in a 300 μl volume with 1 of 17 restriction enzymes: DraI, BclI, CfoI, EcoRI, EcoRV, HincII, BglII, RsaI, BamHI, HindIII, SacI, AluI, HinfI, Sau3A, TaqI and MboI. The digested DNA was ethanol precipitated and separated on 0.7% agarose gels and blotted onto Hybond-N filters. Radiolabelled cosmid, λ or YAC end probe DNA was hybridised to the filters to identify RFLPs.

Selection of Plants Carrying Recombination Events in the Vicinity of CO.

The first step in selecting recombinants was to create lines carrying the co mutation and closely linked markers. This was done twice for different flanking markers. In the first experiment a Landsberg erecta line carrying the co-2 allele (Koornneef et al. 1991) and tt4 was made. The tt4 mutation prevents the production of anthocyanin and has previously been suggested to be a lesion in the gene encoding chalcone synthase, because this maps to a similar location (Chang et al. 1988). The double mutant was crossed to an individual of the Niederzenz ecotype and the resulting hybrid self-fertilised to produce an F₂ population. This population was then screened phenotypically for individuals in which recombination had occurred between co-2 and tt4. In addition, F₂ plants homozygous for both mutations were used to locate marker RFLP g4568 relative to co-2.

The second experiment was performed by using two marked lines as parents. The first of these contained chp7 in a Landsberg erecta background and was derived by Maarten Koornneef (Wageningen) from a cross between a line of undefined background (obtained from George Redei) and Landsberg erecta. The second parent contained markers lu and alb2. This was selected by Maarten Koornneef from a cross of a plant of S96 background carrying the alb2 mutation (M4-6-18; Relichová 1976) to a line containing co-1 and lu (obtained by Koornneef from J. Relichová, but originally from G. Rédei). The chp7, co-1 line was then crossed to the lu, alb2 line and an F₂ population derived by self-fertilisation of the hybrid. This population was used to isolate the recombinants with crossovers between lu and co-1 and between co-1 and alb2. Both classes of recombinants were recognised phenotypically as lu homozygotes. These are only present if recombination occurs between lu and alb2, because alb2 is lethal when homozygous.

Isolation of the CO (FG) Locus:

The CO gene is located on the upper arm of chromosome 5 and is 2 cM proximal to tt4. The average physical distance corresponding to 1 cM in Arabidopsis is approximately 140 kb. The distance from CHS to CO might therefore be expected to be ca. 300 kb.
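The estimate of the physical distance follows from simple arithmetic on the two figures quoted above. A minimal sketch, with names chosen here purely for illustration, is:

```python
# Convert a genetic distance into an approximate physical distance.
# Both figures come from the text; the names are illustrative only.

GENETIC_DISTANCE_CM = 2.0  # CO lies about 2 cM proximal to tt4 (CHS)
KB_PER_CM = 140.0          # approximate average for Arabidopsis, as stated

def physical_distance_kb(distance_cm: float, kb_per_cm: float = KB_PER_CM) -> float:
    """Return the physical distance in kb implied by a genetic distance in cM."""
    return distance_cm * kb_per_cm

estimate_kb = physical_distance_kb(GENETIC_DISTANCE_CM)
print(f"Estimated CHS to CO distance: {estimate_kb:.0f} kb")
```

The product is 280 kb, which is of the same order as the figure of ca. 300 kb given in the text. The conversion factor is an average over the genome, so the true local value may differ.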

We started by hybridising 4 RFLP markers that are closely linked (within ca. 2 cM) to CHS to the EG and EW YAC libraries. This produced 18 hybridising YACs. These were run out on pulsed field gels, Southern blotted and hybridised to the appropriate RFLP clone. This confirmed the colony hybridisation result and measured the size of the YACs; they ranged from 50 kb to 240 kb in size. The YACs were then digested with restriction enzymes, hybridised to RFLP marker DNA and the pattern of fragments compared to that of the marker. This allowed us to determine whether they contained all the fragments in the RFLP marker or only some of them and permitted us to deduce how the YACs lay in relation to each other. In most cases this arrangement was later confirmed by the isolation of inverse polymerase chain reaction (PCR) generated fragments which are located at the ends of the Arabidopsis DNA inserted within the YAC, and hybridisation of these to the appropriate overlapping YACs.

The short contigs around the RFLP markers were then extended. We obtained two sets of overlapping cosmid clones from this area and used the appropriate ones against the YAC libraries. This identified two new YACs. End probes derived from most of the 20 YACs we had identified were then used to screen the libraries and new YACs extending the cloned region in both directions were identified. In all a detailed analysis of 67 YACs was necessary. It allowed us to assemble one contiguous segment of Arabidopsis DNA which includes RFLP markers 6833, CHS, pCIT1243 and 5962 and is approximately 1700 kb long.

The location of CO within the contig was determined by detailed RFLP analysis after the isolation of recombinants containing cross-overs very closely linked to CO. The recombinants were identified by using flanking phenotypic markers. First we made a Landsberg erecta chromosome marked with co and tt4. Then we crossed this to Niederzenz and screened 1200 F2 plants for recombinant chromosomes carrying crossovers between co and tt4. In this way we found twelve recombinants which were confirmed by scoring the phenotypes of their progeny. The rarity of these recombinants confirmed the extremely close linkage between tt4 and co. These recombinants were then used to locate CO on the contig. For example, some of them contain Landsberg DNA on the tt4 side of the cross-over and Niederzenz DNA on the co side. DNA isolated during our walk was positioned relative to CO by using small fragments as RFLP markers and hybridising them to the DNA extracted from the recombinants. We used a similar approach on the proximal side by screening for recombinants between co and alb2. This work initially located CO between two YAC end probes which are approximately 300 kb apart.

To locate CO more accurately within the 300 kb, more cross-overs between co and the flanking phenotypic markers were screened for. Using a similar rationale to that described earlier, a total of 46 cross-overs between co and alb2 (an interval of 1.6 cM proximal to CO), and 135 between co and lu (an interval of 5.3 cM distal to CO) were identified and analysed with appropriate RFLP markers derived from our contig. This located the gene to a very short region defined by two YAC end probes. These were used to screen a cosmid library provided to us by the University of Minnesota, and a short cosmid contig containing 3 cosmids that spanned the entire region was constructed. Analysis of these cosmids indicated that the detailed RFLP mapping had located CO to a region approximately 38 kb long.

To position the gene within the cosmids, each of them was introduced into co mutants and the resulting plants examined to determine which of the cosmids corrected the co mutant phenotype. Roots of plants homozygous for co-2 and tt4 mutations were co-cultivated with Agrobacterium strains containing each cosmid (Olszewski and Ausubel, 1988; Valvekens et al 1989) and kanamycin resistant plants regenerated. The regenerants (T1 generation) were self-fertilised and their progeny sown on medium containing kanamycin to confirm that they contained the T-DNA (Table 1).

A total of 5 independent transformants containing cosmid A, 9 containing cosmid B and 13 containing cosmid C produced kanamycin resistant T2 progeny and were studied further. The flowering time of 20-40 plants from each of these T2 families was measured in the long day greenhouse. All of the progeny of transgenic plants made with cosmid A flowered as late as the co-2 mutants, suggesting that this cosmid did not contain the CO gene. However, several of the families derived from plants containing cosmids B and C included early flowering individuals. In total, 6 of the 9 families derived from plants harboring cosmid B and 12 of the 13 derived from those carrying cosmid C contained plants that flowered as early as wild-type. All of these early-flowering individuals produced light coloured seeds indicating that they carried the tt4 mutation present in the line used for the transformation, and therefore were not simply the result of the experiment being contaminated with seeds of wild-type plants (Experimental Procedures). These results strongly suggest that the CO gene is contained in both cosmids B and C.

Further experiments were carried out in the T3 generation to confirm the complementation results. A total of five T2 early-flowering plants derived from cosmid B and six from cosmid C were self-fertilised and studied further in the T3 generation. Each of the T2 plants chosen for this analysis was derived from a different transformant, was the earliest flowering plant in the T2 family and was a member of a family that had shown a ratio of 3 kanamycin resistant seedlings for each kanamycin sensitive seedling, and therefore probably contained the transgene at only one locus (Table 1). All of the seedlings in these T3 families were resistant to kanamycin demonstrating that the parental T2 plants were homozygous for the T-DNA. This demonstrated that the earliest flowering T2 plants were homozygous for the CO transgene.
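The inference of a single transgene locus from a 3:1 ratio of resistant to sensitive seedlings can be checked with a chi-square goodness-of-fit test. The sketch below is one way this might be done; the seedling counts are hypothetical and are not taken from Table 1, and the text does not state that such a test was applied.

```python
# Chi-square goodness-of-fit test against a 3:1 segregation ratio.
# The counts used below are hypothetical, for illustration only.

CRITICAL_VALUE_5PCT_1DF = 3.841  # chi-square critical value, 1 d.f., P = 0.05

def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Return the chi-square statistic for a 3 resistant : 1 sensitive model."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)

def consistent_with_single_locus(resistant: int, sensitive: int) -> bool:
    """True if the counts do not differ significantly from 3:1 at the 5% level."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_VALUE_5PCT_1DF

# Hypothetical family segregating close to 3:1 (one locus).
print(consistent_with_single_locus(76, 24))
# Hypothetical family close to 15:1, as expected for two unlinked loci.
print(consistent_with_single_locus(94, 6))
```

A family that fits 3:1 is consistent with, but does not prove, a single insertion locus; closely linked insertions would segregate in the same way.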

Under the long-day conditions used the co-2 mutant plants flowered considerably later than the wild-type controls (Table 1). The T3 plants flowered at least as early as wild-type under defined long-day conditions, and some individuals flowered earlier than wild-type (Table 1). This analysis confirmed that cosmids B and C can correct the effect of the co-2 mutation on flowering time under long days, suggesting that both of these cosmids contained CO, and therefore that the gene was in the region of overlap between them. This region was 6.5 kb long.

We determined the sequence of the 6.5 kb that was shared by cosmids B and C. This contains only one gene that we can readily identify from the DNA sequence. The polymerase chain reaction was used to amplify this gene from three independently isolated co mutants, and sequencing of these genes demonstrated that all three contained mutations. This, together with the complementation analysis, is conclusive evidence that this is the CO gene. The predicted amino acid sequence of CO shows no homology to previously reported genes. However, the amino terminus contains two regions that are predicted to form zinc fingers, suggesting that the protein product binds to DNA and is probably a transcription factor.

Unexpected Difficulties in Identifying CO within the 300 kb Region Defined by REG17B5 and LEW4A9

1. Locating the Gene by More Detailed RFLP Mapping and Complementation

As mentioned, Putterill et al, Mol. Gen. Genet. 239:145-157 (1993) described location of CO to within a region of 300 kb. To locate CO more accurately by RFLP mapping, two materials were required: more recombinants carrying cross-overs within the 300 kb region, and more RFLP markers to use as probes against these recombinants.

Recombinants between lu and co or between co and alb2 were selected. A total of 68 cross-overs in the 1.6 cM between lu and co were identified, and 128 in the 5.3 cM between co and alb2. This is equivalent to 196 cross-overs in 6.8 cM, or an average of 29 cross-overs per cM. Among these recombinants, cross-overs within the 300 kb were unexpectedly under-represented: 300 kb is equivalent to around 1.5 cM, so 43 (29×1.5) cross-overs would be expected in this region. Only 23 were found.
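The under-representation described above can be reproduced with a short calculation. The sketch below uses only the figures quoted in the text (including the stated total interval of 6.8 cM); the variable names are illustrative. Note that 29×1.5 is 43.5, which the text gives as 43.

```python
# Expected versus observed cross-overs within the 300 kb region.
# All numbers are taken from the text; names are illustrative only.

crossovers_lu_co = 68
crossovers_co_alb2 = 128
total_crossovers = crossovers_lu_co + crossovers_co_alb2  # 196

total_interval_cm = 6.8   # total interval as stated in the text
region_cm = 1.5           # 300 kb taken as roughly 1.5 cM, as stated
observed_in_region = 23

per_cm = total_crossovers / total_interval_cm
expected_in_region = round(per_cm) * region_cm  # 29 x 1.5

print(f"Average cross-overs per cM: {per_cm:.1f}")
print(f"Expected in region: {expected_in_region}")
print(f"Observed / expected: {observed_in_region / expected_in_region:.2f}")
```

On these figures only about half of the expected number of cross-overs was recovered within the region.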

The analysis of these cross-overs was also difficult because none of the YAC end probes that fell within the 300 kb could be used as RFLP probes. This was because none of them detected RFLPs between the parental lines used to make the recombinants. One RFLP marker (pCIT1243) was available within the region, and when this was used to analyse the recombinants it was found to be between REG17B5 and CO, thereby positioning the gene between pCIT1243 and LEW4A9. However, a more accurate position of the gene could not be achieved by this method because of the lack of suitable probes.

The distribution of cross-overs between pCIT1243 and LEW4A9 was asymmetric: there was one between pCIT1243 and CO and 19 between CO and LEW4A9. We guessed that the gene was likely to be close to pCIT1243. A pool of probes (LEG4C9, Labil9E1, pCIT1243, LEG21H11 and REG4C9) from this region was therefore used to screen a cosmid library to provide a series of cosmid clones extending from pCIT1243 towards LEW4A9. Analysis of these clones with individual probes showed that the three cosmids A, B and C extended from pCIT1243 in the direction required. These were then used as RFLP markers and the gene demonstrated to be on the cosmids.

The procedure was therefore more complex than that envisaged in the Putterill et al paper because of the difficulty in making enough recombinants within the 300 kb region, and in identifying suitable RFLP markers.

2. Identifying the Gene by Complementation

The three cosmids A, B and C were introduced into mutant plants, and it was shown that B and C could correct the effect of the mutation. The gene must therefore be on the DNA shared by B and C, but the method proposed in the Putterill paper for final identification of the CO gene failed. It had been assumed that one would be able to identify a transcript for CO by using the complementing DNA as a probe against Northern blots, or that one of the seven alleles would show a re-arrangement on Southern blots that would lead to the gene. In fact, we could not detect the CO transcript on Northern blots nor any re-arrangement indicative of where the gene might be.

The failure of this approach led us to sequence the genomic DNA that complemented the mutation. Computer analysis of this DNA identified two open reading frames adjacent to each other and we guessed that these might represent the CO gene. We still had no evidence that these ORFs were actively transcribed, as one would expect for a gene, because no transcript was detectable on Northern blots and no cDNA was detected in several cDNA libraries. We therefore used the polymerase chain reaction (PCR) to amplify a cDNA from RNA preparations. This showed that these two ORFs did indeed represent one active gene. Sequencing co alleles then confirmed that they contained single base changes, or in one case a 9 bp deletion, that would not have been detected by the approaches proposed in the Putterill et al paper.

Gene Structure

To determine the gene structure, a cDNA for the CO gene was identified using RT-PCR (Experimental Procedures). The sequence of the cDNA contains an 1122 bp ORF that is derived from both ORFs identified in the genomic sequence by removal of a 233 bp intron. Translation of this open reading frame is predicted to form a protein containing 373 amino acids with a molecular mass of 42 kd. The transcription start site was not determined, but an in frame translation termination codon is located three codons upstream of the ATG, indicating that the entire translated region was identified. The 3′ end of the transcript was located by sequencing four fragments produced by 3′-RACE. They all contained the poly-A tail at different positions within 5 bases of each other.
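The relationship between the ORF length and the predicted protein size can be illustrated as follows. The sketch assumes that the 1122 bp figure includes the termination codon, and uses an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from the text.

```python
# Relate the 1122 bp ORF to the predicted 373 amino acid protein.
# AVERAGE_RESIDUE_MASS_DA is a rule-of-thumb assumption, not from the text.

ORF_LENGTH_BP = 1122
AVERAGE_RESIDUE_MASS_DA = 110

codons, remainder = divmod(ORF_LENGTH_BP, 3)
assert remainder == 0, "an ORF should contain a whole number of codons"

# Assumption: the final codon is the stop codon and encodes no amino acid.
amino_acids = codons - 1
approx_mass_kd = amino_acids * AVERAGE_RESIDUE_MASS_DA / 1000

print(f"Codons: {codons}, amino acids: {amino_acids}")
print(f"Approximate molecular mass: {approx_mass_kd:.0f} kd")
```

The rough estimate is close to the 42 kd stated in the text; an exact figure would require summing the masses of the actual residues in SEQ ID NO. 2.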

Available data bases were searched for proteins sharing homology with the predicted translation product of the CO gene. Searching the PROSITE directory detected no motifs within the CO protein. Moreover, a FASTA search comparing the CO protein sequence with those in GenBank detected no significant homologies. Direct comparison of the CO sequence with that of LUMINIDEPENDENS, the other flowering time gene cloned from Arabidopsis (Lee et al, 1994), detected no homology. However, analysis of the protein sequence by eye identified a striking arrangement of cysteine residues that is present in two regions near the amino terminus of the CO protein. Each of these regions contains four cysteines in a C-X₂-C-X₁₆-C-X₂-C arrangement, that is similar to the zinc-finger domains of GATA-1 transcription factors (C-X₂-C-X₁₇-C-X₂-C).
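The cysteine spacing described above lends itself to a simple pattern search. The sketch below expresses the two arrangements as regular expressions and applies them to a synthetic test sequence; the sequence is invented for illustration and is not the CO protein sequence, and the text states that the arrangement was identified by eye, not by a search of this kind.

```python
import re

# C-X2-C-X16-C-X2-C, the arrangement reported for the CO finger regions.
CO_LIKE_FINGER = re.compile(r"C.{2}C.{16}C.{2}C")
# C-X2-C-X17-C-X2-C, the arrangement given for GATA-1 zinc fingers.
GATA1_LIKE_FINGER = re.compile(r"C.{2}C.{17}C.{2}C")

def find_fingers(protein: str, pattern: re.Pattern) -> list:
    """Return (start index, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]

# Synthetic sequence (hypothetical): two flanking residues, then a
# C-X2-C-X16-C-X2-C arrangement, then two trailing residues.
toy_sequence = "MK" + "CDTC" + "N" * 16 + "CAAC" + "GG"

co_hits = find_fingers(toy_sequence, CO_LIKE_FINGER)
gata_hits = find_fingers(toy_sequence, GATA1_LIKE_FINGER)

print("CO-like matches:", co_hits)
print("GATA-1-like matches:", gata_hits)
```

Because `.` matches any residue, including cysteine, a pattern of this kind is permissive and would need refining before use on real sequence data.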

Comparison of two 43 amino acid stretches that are directly adjacent to each other within the predicted CO protein sequence and each of which contains one of the proposed zinc fingers indicates striking homology: 46% of the amino acids are identical and 86% are either identical or related. The conservation is most apparent on the carboxy side of each finger, which is again reminiscent of GATA1 transcription factors, in which this region is a basic domain required for DNA binding and is highly conserved (Trainor et al, 1990; Brendel and Karlin, 1989; Ramain et al, 1993). In the CO protein this region is also positively charged: there is a net positive charge of 6 in the region adjacent to the amino finger and of 3 in the one next to the carboxy finger.

Comparison of the protein sequence of the CO zinc fingers with 116 amino acids that contain the zinc fingers of hGATA1 and are conserved between members of the GATA1 family (see Ramain et al, 1993) using the FASTA programme of the Wisconsin package identified one 81 amino acid region of homology that spans both zinc fingers of CO and aligns the cysteines of the zinc fingers of hGATA1 and those of CO. Between these regions of CO and hGATA1, 21% of the amino acids are identical and 65% are similar or identical. Therefore although CO is not a member of the GATA1 family it shows similarity to them in the region of the zinc fingers and represents a new class of zinc-finger containing protein.

A further indication that these regions are important for CO activity is that the mutations in both the co-1 and co-2 alleles affect residues that are conserved between the proposed finger regions: co-2 changes an arginine on the carboxy side of the N-terminal finger to a histidine, and the co-1 deletion removes three amino acids from the carboxy side of the C-terminal finger.

Expression of CO mRNA in Long and Short Day Grown Plants

No CO cDNA clones were found by screening several Arabidopsis cDNA libraries and the mRNA was not detected on Northern blots of polyA mRNA extracted from seedlings at the 3-4 leaf stage (data not shown). RT-PCR followed by Southern blotting and hybridisation to a CO specific probe was therefore used to detect the CO transcript. The RNA used in these experiments was isolated from seedlings at the 3-4 leaf stage, because this is just before the floral bud is visible under long days and therefore seemed a likely time for the gene to be expressed.

Six independent RNA preparations made from plants growing under long days all produced a hybridising fragment of the size expected for the CO cDNA. No difference in abundance of the CO transcript was detected between wild-type and co-1 mutant plants, suggesting that activity of the CO gene is not required to promote its own transcription.

Flowering Time under Long Days is Influenced by CO Gene Dosage.

Plants that are heterozygous for a wild-type allele and either co-1 or co-2 flower at a time intermediate between co homozygotes and Landsberg erecta under long days (Koornneef et al, 1991; F. Robson, unpublished). Sequencing of these mutant alleles demonstrated that they both contain in frame alterations to the amino acid sequence. This might suggest two models for the partial dominance of co. The mutant alleles might give rise to an altered product that interferes with floral induction, or the mutations might cause loss of function and the two-fold reduction in the level of the CO protein in a heterozygote lead to a delay in flowering time (haplo-insufficiency). The haplo-insufficiency explanation is favoured by the results included herein.

In the complementation experiments, transgenic plants containing two copies of cosmids B or C and homozygous for the co-2 allele often flowered at the same time as wild-type plants under long days. If the mutant allele encoded a product that interfered with the activity of the wild-type protein, then this would not be expected to occur. Moreover, the need to use RT-PCR to detect the CO transcript suggests that it is present at very low levels, which is consistent with the possibility that further reductions in transcript level cause late flowering.

Increases in the dosage of CO can lead to slightly earlier flowering under long days. This was concluded from the observation that some of the transgenic lines carrying extra copies of the CO gene flowered slightly earlier than wild type plants (Tables 1 and 2). This observation, together with the haplo-insufficiency phenotype discussed above, suggests that the level of expression of CO is a critical determinant of flowering time of Arabidopsis under long days.

Methods

Growth Conditions and Measurement of Flowering Time

Flowering time was measured under defined conditions by growing plants in Sanyo Gallenkamp Controlled Environment rooms at 20° C. Short days comprised a photoperiod of 10 hours lit with 400 Watt metal halide power star lamps supplemented with 100 watt tungsten halide lamps. This provided a level of photosynthetically active radiation (PAR) of 113.7 μmoles photons m⁻² s⁻¹ and a red:far red light ratio of 2.41. A similar cabinet and lamps were used for the long day. The photoperiod was for 10 hours under the same conditions used for short days and extended for a further 8 hours using only the tungsten halide lamps. In this cabinet the combination of lamps used for the 10 hour period provided a PAR of 92.9 μmoles photons m⁻² s⁻¹ and a red:far red ratio of 1.49. The 8 hour extension produced PAR of 14.27 μmoles m⁻² s⁻¹ and a red:far-red ratio of 0.66.

The flowering times of large populations of plants were measured in the greenhouse. In the summer the plants were simply grown in sunlight. In winter supplementary light was provided so that the minimum daylength was 16 hours.

To measure flowering time, seeds were placed at 4° C. on wet filter paper for 4 days to break dormancy and were then sown on soil. Germinating seedlings were usually covered with cling film or propagator lids for the first 1-2 weeks to prevent dehydration. Flowering time was measured by counting the number of leaves, excluding the cotyledons, in the rosette at the time the flower bud was visible. Leaf numbers are shown with the standard error at 95% confidence limits. The number of days from sowing to the appearance of the flower bud was also recorded, but is not shown. The close correlation between leaf number and flowering time was previously demonstrated for Landsberg erecta and co alleles (Koornneef et al, 1991).
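One way in which a mean leaf number and its 95% confidence limits might be calculated is sketched below. The leaf counts are hypothetical, and the use of a normal approximation (1.96 standard errors) is an assumption; the text does not specify how the limits were derived, and for small samples a t-distribution multiplier would give wider limits.

```python
import math
import statistics

def mean_with_confidence_limits(leaf_counts, z=1.96):
    """Return (mean, half-width) of an approximate 95% confidence interval.

    The half-width is z times the standard error of the mean, where the
    standard error is the sample standard deviation divided by sqrt(n).
    """
    mean = statistics.mean(leaf_counts)
    standard_error = statistics.stdev(leaf_counts) / math.sqrt(len(leaf_counts))
    return mean, z * standard_error

# Hypothetical rosette leaf counts for five plants of one genotype.
example_counts = [6, 7, 7, 8, 7]
mean_leaves, half_width = mean_with_confidence_limits(example_counts)
print(f"Leaf number: {mean_leaves:.1f} +/- {half_width:.2f}")
```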

Plant Material

The standard wild-type genotype used was Arabidopsis thaliana Landsberg erecta. The co-1 mutation was isolated by Redei (1962) and is in an ERECTA background that, in our experiments, showed no detectable RFLPs or sequence variation from Landsberg erecta. The co-2 allele was isolated in Landsberg erecta (Koornneef et al, 1991). The details of the lines used for the accurate RFLP mapping of co were described previously (Putterill et al, 1993).

In all cases described, lines carrying co-2 also carried tt4, although in order not to over-complicate the genotype descriptions in the text this is not mentioned. The tt4 mutation is within the chalcone synthase gene and prevents anthocyanin accumulation in the seed coat, but does not affect flowering time (Koornneef et al, 1983). The mutation is located on chromosome 5, approximately 3.3 cM from co (Putterill et al, 1993). The use of a co-2 tt4 line was useful in confirming that individual plants did carry the co-2 mutation.

RNA Extractions

RNA was extracted using a modified version of the method described by Stiekema et al (1988). Approximately 5 g of tissue frozen in liquid nitrogen was ground in a coffee grinder and extracted with a mixture of 15 ml of phenol and 15 ml of extraction buffer (50 mM Tris pH 8, 1 mM EDTA, 1% SDS). The mixture was shaken, centrifuged and 25 ml of the aqueous layer recovered. This was then shaken vigorously with a mixture of 0.7 ml 4 M sodium chloride, ml phenol and 10 ml of chloroform. The aqueous layer was recovered after centrifugation and extracted with 25 ml of chloroform. The RNA was then precipitated from 25 ml of the aqueous layer by the addition of 2 ml of 10 M LiCl, and the precipitate recovered by centrifugation. The pellet was dissolved in 2 ml DEPC water and the RNA precipitated by the addition of 0.2 ml of 4 M sodium chloride and 4 ml of ethanol. After centrifugation the pellet was dissolved in 0.5 ml of DEPC water and the RNA concentration determined.

DNA Extractions

Arabidopsis DNA was extracted by the CTAB method described by Dean et al (1992).

Isolation of cDNA by RT-PCR

Total RNA was isolated from whole seedlings at the 2-3 leaf stage growing under long days in the greenhouse. For first strand cDNA synthesis, 10 μg of RNA in a volume of 10 μl was heated to 65° C. for 3 minutes, and then quickly cooled on ice. 10 μl of reaction mix was made containing 1 μl of RNAsin, 1 μl of standard dT₁₇-adapter primer (1 μg/μl; Frohman et al, 1988), 4 μl of 5×reverse transcriptase buffer (250 mM Tris HCl pH 8.3, 375 mM KCl, 15 mM MgCl₂), 2 μl DTT (100 mM), 1 μl dNTP (20 mM), 1 μl reverse transcriptase (200 units, M-MLV Gibco). This reaction mix was then added to the RNA, creating a final volume of 20 μl. The mixture was incubated at 42° C. for 2 hours and then diluted to 200 μl with water.
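As a consistency check on the first strand reaction described above, the component volumes and the resulting final concentrations can be tallied. The following is a minimal Python sketch; the labels, function names and variable names are illustrative assumptions, and the final volume is taken to be the 10 μl of RNA plus the 10 μl of reaction mix.

```python
# Volumes (microlitres) of the reaction mix components listed in the text.
# The dictionary keys are illustrative labels only.
reaction_mix_ul = {
    "RNAsin": 1.0,
    "dT17-adapter primer (1 ug/ul)": 1.0,
    "5x reverse transcriptase buffer": 4.0,
    "DTT (100 mM)": 2.0,
    "dNTP (20 mM)": 1.0,
    "M-MLV reverse transcriptase": 1.0,
}
rna_ul = 10.0

mix_ul = sum(reaction_mix_ul.values())
final_ul = mix_ul + rna_ul

def final_concentration(stock, volume_ul, total_ul):
    """Simple dilution: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul

buffer_ul = reaction_mix_ul["5x reverse transcriptase buffer"]
buffer_x = final_concentration(5.0, buffer_ul, final_ul)    # fold strength
tris_mm = final_concentration(250.0, buffer_ul, final_ul)   # mM Tris HCl
kcl_mm = final_concentration(375.0, buffer_ul, final_ul)    # mM KCl
mgcl2_mm = final_concentration(15.0, buffer_ul, final_ul)   # mM MgCl2
dtt_mm = final_concentration(100.0, 2.0, final_ul)          # mM DTT
dntp_mm = final_concentration(20.0, 1.0, final_ul)          # mM dNTP

print(f"mix {mix_ul} ul + RNA {rna_ul} ul = {final_ul} ul")
print(f"buffer {buffer_x}x, Tris {tris_mm} mM, KCl {kcl_mm} mM, MgCl2 {mgcl2_mm} mM")
print(f"DTT {dtt_mm} mM, dNTP {dntp_mm} mM")
```

The listed components add up to the stated 10 μl of reaction mix, and the 5× buffer is diluted to 1× in the assembled reaction.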

10 μl of the diluted first strand synthesis reaction was added to 90 μl of PCR mix containing 4 μl 2.5 mM dNTP, 10 μl 10×PCR buffer (Boehringer plus Mg), 1 μl of a 100 ng/μl solution of each of the primers, 73.7 μl of water and 0.3 μl of 5 units/μl Taq polymerase (Boehringer or Cetus Amplitaq). The primers used were CO₄₉ (5′GCTCCCACACCATCAAACTTACTAC; 5′ end located 38 bp upstream of translational start of CO) and CO₅₀ (5′CTCCTCGGCTTCGATTTCTC; 5′ end located 57 bp upstream of translational termination codon of CO). The reaction was performed at 94° C. for 1 minute, 34 cycles of 55° C. for 1 minute, 72° C. for 2 minutes and then finally at 72° C. for 10 minutes. 20 μl of the reaction was separated through an agarose gel, and the presence of a fragment of the expected size was demonstrated after staining with ethidium bromide. The DNA was transferred to a filter, and the fragment of interest was shown to hybridise to a short DNA fragment derived from the CO gene. The remainder of the PCR reaction was loaded onto another gel, the amplified fragment was extracted, treated with T4 DNA polymerase and ligated to Bluescript vector (Stratagene) cleaved with EcoRV. The PCR reaction was done in duplicate, and two independently amplified cDNAs were sequenced to ensure that any PCR induced errors were detected.
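Basic properties of the two primer sequences given above can be computed directly from the sequences. The following Python sketch reports length, GC content and a rough melting temperature from the Wallace rule (2° C. per A or T, 4° C. per G or C). The Wallace rule is a coarse approximation intended for short oligonucleotides and is used here only for illustration; it is not stated to have been used in the described work, and the function names and dictionary labels are assumptions.

```python
def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rough melting temperature (deg C): 2 per A/T plus 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

# Sequences copied from the text; the labels are illustrative.
primers = {
    "upstream primer (CO49)": "GCTCCCACACCATCAAACTTACTAC",
    "downstream primer": "CTCCTCGGCTTCGATTTCTC",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, "
          f"GC {gc_fraction(seq):.0%}, "
          f"Wallace Tm about {wallace_tm(seq)} C")
```

Both estimates lie above the 55° C. annealing temperature used in the reaction, which is the usual relationship between annealing temperature and primer melting temperature.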

Isolation of cDNA Fragments by 3′ RACE

First strand cDNA synthesis was performed using the same conditions, RNA preparation and dT₁₇-adapter as described above for RT-PCR. The PCR was then performed using the standard adapter primer (5′gactcgagtcgacatcg; Frohman et al, 1988) and the CO₄₉ primer described above. The PCR conditions were the same as described above, except that the amplification cycle was preceded by a 40 minute extension at 72° C. 20 μl of the reaction was separated through an agarose gel, and a smear of fragments between 550 bp and 1.6 kb in length was detected. The remainder of the reaction was loaded on a similar gel, the region predicted to contain fragments of 1-2 kb was excised, the DNA extracted and subjected to a second round of PCR using the adapter primer and another CO specific primer (CO28, 5′tgcagattctgcctacttgtgc, 5′ end located 94 bp downstream of translational start site of CO). When this PCR was monitored on an agarose gel a fragment around the expected size of 1.3 kb was detected. This fragment was extracted from the gel, treated with T4 DNA polymerase and ligated to Bluescript DNA cleaved with EcoRV. Four amplified fragments recovered from two independent amplifications were sequenced entirely. All four were polyadenylated at slightly different positions, as described in the text.

Detection of CO Transcript by RT-PCR

First strand synthesis was performed exactly as described above for the method used to isolate a cDNA clone, except that the RNA was isolated from plants grown in controlled environment cabinets at different stages. All samples were harvested and analysed in duplicate.

The primers used to amplify CO cDNA are described in the text and previously in Experimental Procedures. The primers used to amplify the cDNA of the gene used as a control were CO1 (5′ TGATTCTGCCTACTTGTGCTC) and CO2 (5′ GCTTGGTTTGCCTCTTCATC).

DNA Sequencing

The Sanger method was used to sequence fragments of interest inserted in a Bluescript plasmid vector. Reactions were performed using a Sequenase kit (United States Biochemical Corporation).

Isolation of Clones Containing Each of the Seven co Alleles

DNA was extracted from plants homozygous for each of the alleles. Approximately 1 ng of genomic DNA was diluted to 10 μl with water and added to 90 μl of reaction mix, as described above except that primers CO₄₁ (5′ggtcccaacgaagaagtgc; 5′ end located 263 bp upstream of translational start codon of CO) and CO₄₂ (5′cagggaggcgtgaaagtgt; 5′ end located 334 bp downstream of translational stop codon of CO) were used. The PCR conditions were: 94° C. for 3 minutes, followed by 34 cycles of 94° C. for 1 minute, 55° C. for 1 minute, 72° C. for 2 minutes and then finally 72° C. for 10 minutes. In each case this produced a major fragment of the expected size, 1.95 kb. The PCR was carried out in duplicate for each allele. In each case the reactions were extracted with phenol and chloroform, ethanol precipitated and treated with T4 DNA polymerase. The reactions were then separated through an agarose gel, the fragment purified and ligated to SK+ Bluescript cleaved with EcoRV. Ligations were introduced into E. coli DH5 alpha and the recombinant plasmids screened by colony PCR for those carrying an insertion of the expected size. The DNA sequences of two independently amplified fragments derived from each allele were determined.
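The cycling conditions given above can be written out as a simple programme, from which the total hold time follows. The following Python sketch is illustrative only; the class, function and variable names are assumptions, and the time taken to ramp between temperatures is ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    minutes: float  # hold time in minutes

# Programme as stated in the text for amplification of the co alleles.
initial_denaturation = [Step(94, 3)]
cycle = [Step(94, 1), Step(55, 1), Step(72, 2)]
n_cycles = 34
final_extension = [Step(72, 10)]

def expand_programme(initial, cycle, n_cycles, final):
    """Return the full ordered list of steps, with cycles unrolled."""
    return list(initial) + list(cycle) * n_cycles + list(final)

programme = expand_programme(initial_denaturation, cycle, n_cycles, final_extension)
total_minutes = sum(step.minutes for step in programme)

print(f"{len(programme)} steps, {total_minutes} minutes of hold time")
```

Ignoring ramp times, the programme amounts to roughly two and a half hours of block time.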

Screening Phage and Cosmid Libraries

A lysate of the cosmid library (Olszewski and Ausubel, 1988) was used to infect E. coli DH5 alpha, and twenty thousand colonies were screened with the probes described in the text. Three cDNA libraries were screened to try to identify a CO cDNA. The numbers of plaques screened were 5×10⁵ from the “aerial parts” library (supplied by EC Arabidopsis Stock Center, MPI, Cologne), 3×10⁵ plaques of a library made from plants growing in sterile beakers (made by Dr A. Bachmair and supplied by the EC Arabidopsis Stock Center) and 1×10⁶ plaques of the CD4-71-PRL2 library (supplied by the Arabidopsis Biological Resource Center at Ohio State University).

Transformation of Arabidopsis

The cosmids containing DNA from the vicinity of CO were mobilised into Agrobacterium tumefaciens C58C1, and the T-DNA introduced into Arabidopsis plants as described by Valvekens et al, 1988. Roots of plants grown in vitro were isolated and grown on callus-inducing medium (Valvekens et al, 1988) for 2 days. The roots were then cut into short segments and co-cultivated with Agrobacterium tumefaciens carrying the plasmid of interest. The root explants were dried on blotting paper and placed onto callus-inducing medium for 2-3 days. The Agrobacterium were washed off, the roots dried and placed onto shoot inducing medium (Valvekens et al, 1988) containing vancomycin to kill the Agrobacterium and kanamycin to select for transformed plant cells. After approximately 6 weeks green calli on the roots start to produce shoots. These are removed and placed in petri dishes or magenta pots containing germination medium (Valvekens et al, 1988). These plants produce seeds in the magenta pots. These are then sown on germination medium containing kanamycin to identify transformed seedlings containing the transgene (Valvekens et al, 1988).

EXAMPLE 2 Construction of Promoter Fusions to the CO Open Reading Frame

A PvuII-EcoRV fragment containing the entire CO gene was inserted into the unique EcoRV site of the Bluescript™ plasmid. The CO gene fragment was inserted in the orientation such that the end defined by the EcoRV site was adjacent to the HindIII site within the Bluescript™ polylinker. This plasmid was called pCO1. The PvuII-EcoRV fragment inserted in pCO1 contains two HindIII sites both 5′ of the point at which translation of the CO protein is initiated. Cleavage of pCO1 with HindIII produces a fragment that contains the entire CO open reading frame from 63 bp upstream of the initiation of translation to the PvuII site which is downstream of the polyadenylation site, as well as all of the Bluescript vector from the PvuII/EcoRV junction created by the ligation event to the HindIII site within the polylinker. Ligation of a promoter containing fragment in the appropriate orientation to this fragment creates a fusion of the promoter to the CO open reading frame. For instance, a variety of promoters may be inserted at this position, as discussed below.

A GSTII Promoter Fusion to the CO Open Reading Frame

The GSTII promoter-containing fragment was derived from plasmid pGIE7 (supplied by Zeneca) as a HindIII-NdeI fragment, whose sequence is shown in FIG. 2. An oligonucleotide adapter (5′ TACAAGCTTG) was inserted at the NdeI site to convert it into a HindIII site. The resulting plasmid was then cleaved with HindIII, and the promoter containing fragment ligated to the HindIII fragment containing the CO open reading frame. A recombinant plasmid that contained the GSTII promoter in the orientation such that transcription would occur towards the CO open reading frame was identified by PstI digestion. The GSTII-CO fusion was then moved into a binary vector described by Jones et al (1992) as a ClaI-XbaI fragment.

The binary vector may be introduced into an Agrobacterium tumefaciens strain and used to introduce the fusion into dicotyledonous species, or the fusion may be introduced into monocotyledonous species by a naked DNA transformation procedure. Protocols for transformation have been established for many species, as discussed earlier.

The GSTII promoter may be used to induce expression of the CO gene by application of an exogenous inducer such as the herbicide safeners dichloramid and flurazole, as described in WO93/01294 (Imperial Chemical Industries Limited).

A Heat Shock Promoter Fusion to the CO Open Reading Frame

An alternative inducible system makes use of the well characterised soybean heat shock promoter, Gmhsp17.3B, whose expression is induced in response to exposure to high temperatures in a variety of plant species (discussed by Balcells et al, 1994). The promoter is available as a 440 bp XbaI-XhoI fragment (Balcells et al, 1994) which after treatment with T4 DNA polymerase may be inserted into pCO1 cleaved with HindIII, as described above for the GSTII fusion. The resulting fusion may then be introduced into the binary vector, Agrobacterium tumefaciens and transgenic plants, as described earlier. CO expression may be induced by exposing plants to temperatures of approximately 40° C.

Fusion to the CO Gene of a Modified CaMV 35S Promoter Containing Tetracycline Resistance Gene Operators

A modified CaMV 35S promoter which contains three operators from the bacterial tetracycline resistance gene has been developed as a chemically inducible system. In the presence of the tetracycline gene repressor protein this promoter is inactive, but this repression is overcome by supplying plants with tetracycline (Gatz et al, 1992). This is an alternative chemically inducible promoter which may be fused to the CO open reading frame. The promoter is available as a SmaI-XbaI fragment (Gatz et al, 1992) which after treatment with T4 DNA polymerase may be inserted into pCO1 cleaved with HindIII as described earlier. After introduction of this fusion into plants also containing the repressor gene, CO expression may be induced by supplying the plants with tetracycline.

A CaMV 35S Promoter Fusion to the CO Open Reading Frame

The CaMV 35S promoter was isolated from plasmid pJIT62 (physical map of which is shown in FIG. 4). The KpnI-HindIII fragment containing the CaMV 35S promoter was fused to the CO open reading frame by ligation to plasmid pCO1 cleaved with HindIII and KpnI. The single KpnI site was then converted to a ClaI site by insertion of an adapter oligonucleotide (5′TATCGATAGTAC), and then a ClaI-BamHI fragment containing the promoter fused to the CO ORF was inserted into a binary vector. The fusion may be introduced into transgenic plants either by the use of Agrobacterium tumefaciens or as naked DNA, as described earlier.

Fusion of the Meri 5 Promoter to the CO Open Reading Frame

The meri 5 promoter is available as a 2.4 kb BglII-StuI fragment (Medford et al, 1991). This may be treated with T4 DNA polymerase and inserted into the HindIII site of pCO1 as described above. The fusion may then be introduced into transgenic plants, as described above.

EXAMPLE 3 Flowering Time under Short Days of Plants Carrying Extra Copies of CO

Under short day conditions wild type plants and co-2 homozygotes both flower at approximately the same time (Table 1), suggesting that the CO product is not required for flowering under these conditions. However, under short days, several of the co-2 tt4 families carrying the T-DNAs derived from cosmids B and C flowered earlier than both the parental co-2 line and wild type (Table 1). In particular, 2 lines (4 and 6) carrying cosmid C flowered much earlier than wild type. This suggested that in some families a transgenic copy of CO was expressed at a higher level than the original copy, or expressed ectopically, and that this led to earlier flowering under short days than that of wild type plants.

Cosmid B was also introduced into wild-type Landsberg erecta plants and T2 plants homozygous for the transgene at a single locus were identified in the same way as described above (Table 1). Of the 3 independent transformants analysed in the T3 generation, one flowered slightly earlier than wild-type plants under long days, and significantly earlier under short days (Table 2). This again suggested that, at least at some chromosomal locations, extra copies of the CO gene can cause early flowering.
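A T-DNA insertion at a single locus is expected to give a ratio of approximately 3:1 kanamycin-resistant to kanamycin-sensitive seedlings in the T2 generation. One common way to compare an observed ratio with this expectation is a chi-square goodness-of-fit test, sketched below in Python. The test is offered as an illustration and is not stated to have been used in the work described. The seedling counts are hypothetical: they are chosen as 26 resistant and 9 sensitive so as to be consistent with a family of 35 plants segregating at about 2.9:1, and the function and variable names are assumptions.

```python
def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = total * 3 / 4
    expected_sensitive = total * 1 / 4
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)

# Critical value of chi-square for 1 degree of freedom at the 5% level.
CRITICAL_5_PERCENT_DF1 = 3.841

# Hypothetical counts, assumed for illustration only.
resistant, sensitive = 26, 9

statistic = chi_square_3_to_1(resistant, sensitive)
consistent_with_single_locus = statistic < CRITICAL_5_PERCENT_DF1

print(f"observed ratio {resistant / sensitive:.1f}:1, chi-square = {statistic:.4f}")
print("consistent with 3:1" if consistent_with_single_locus else "deviates from 3:1")
```

A statistic below the critical value means that the observed counts do not differ significantly from the 3:1 ratio expected for segregation of a single locus.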

EXAMPLE 4 Influencing Flowering Characteristics using a CaMV 35S Promoter/CO Gene Fusion

A fusion of a CaMV 35S promoter to the CO open reading frame was introduced into co mutant Arabidopsis plants. First the ClaI-BamHI fragment described in Example 2 was inserted into the ClaI-BamHI sites of binary vector SLJ1711 (Jones et al., 1992). An Agrobacterium tumefaciens strain carrying this vector was then used for transformation of Arabidopsis root explants, followed by regeneration of transformed plants as described by Valvekens et al. (1988).

The resulting transgenic plants flowered significantly earlier than wild-type under both inductive and non-inductive conditions. For example, under inductive long-day conditions, wild-type plants flowered after forming approximately 5 leaves, while the transgenic plants flowered with 3-4 leaves. Under non-inductive short days, wild-type plants flowered with approximately 20 leaves, while the transgenic plants formed 3-4 leaves. The use of promoter fusions to increase the abundance of the CO mRNA, or to alter the specificity of CO transcription, can therefore be used to lead to dramatically earlier flowering than that of wild-type plants.

In addition, some of the transgenic plants carrying the fusion of the CaMV 35S promoter to the CO gene formed a terminal flower at the end of the shoot. The shoot of wild-type plants shows indeterminate growth, growing and forming flowers on the sides of the shoot indefinitely. However, terminal flower (tfl) mutants show determinate growth, terminating shoot development prematurely by forming a flower at the apex of the shoot. In wild-type plants, the TFL gene is thought to prevent the formation of flowers at the apex of the shoot, by preventing the expression of genes that promote flower development, such as LEAFY (LFY), in the apical cells. This is supported by the observations that LFY is expressed in the shoot apex of tfl mutants but not wild type plants, and that fusions of the CaMV 35S promoter to LFY cause transgenic plants to form a terminal flower (Weigel and Nilsson, 1995). While not intending to be bound by any particular theory, the fusion of CO to the CaMV 35S promoter might therefore cause a terminal flower by activating genes such as LFY at the apex of the shoot.

The two phenotypes caused by the CO fusion to the CaMV 35S promoter, early flowering and the formation of a terminal flower, may be separated by the use of other promoters. For example, terminal flower formation might be optimised by using a promoter, such as that of the meri 5 gene mentioned above, that is expressed mainly in the apical meristem, while early flowering without a terminal flower might result from expressing the gene from promoters that are not well expressed in the apical meristem, such as a heat-shock promoter.

EXAMPLE 5 Cloning of a CO Homologue from Brassica napus

Low stringency hybridizations (Sambrook et al., 1989) were used to screen a lambda genomic DNA library made from Brassica napus DNA. Positively hybridizing clones were analysed and classified by constructing maps of their restriction enzyme cleavage sites (using HindIII, XhoI, EcoRV, XbaI, EcoRI and NdeI). CO homologues were distinguished from other members of the CO gene family because of the similarity of their restriction enzyme map with that of the Arabidopsis CO gene, and because a second gene that is located close to CO in the Arabidopsis genome was shown to be present at a similar position in the Brassica clones. Two CO homologues, corresponding to the genes present on Brassica napus linkage groups N10 and N19 (Sharpe et al., 1995), were then sub-cloned into plasmids and sequenced. The sequence of the gene from the N10 linkage group is shown in FIG. 5 and that from the N19 linkage group is shown in FIG. 6. The amino acid sequences of the proteins encoded by these genes are very similar to that of the Arabidopsis CO gene, particularly in the regions demonstrated by mutagenesis to be important for the functioning of the protein; 86 amino acids across the zinc-finger region are 84% identical, and a 50 amino acid region at the carboxy terminus of the protein, that is affected in two of the Arabidopsis mutants, is 88% identical. These two regions are the most conserved, with the intervening 187 amino acids from the middle of the protein being 64% identical.

This sequence analysis indicates that CO homologues can be isolated from plant species other than Arabidopsis. In addition, restriction fragment length polymorphism mapping strongly suggests that CO homologues are important in regulating flowering time of other species. For example, in Brassica nigra a CO homologue closely co-segregates with a major quantitative trait locus for flowering time (U. Lagercrantz et al, in press), and in Brassica napus CO homologues mapping to linkage groups N2 and N12 co-segregate with allelic variation for flowering time.

TABLE 1
Flowering time and segregation of kanamycin resistance in T2 and T3 generations of co-2 plants carrying the T-DNA of cosmid B or C

                    Ratio of Km    Average LN at     Average LN at     Ratio of Km
Transgenic          resistant      flowering of T3   flowering of T3   resistant
co tt4 line         seedlings      individuals       individuals       seedlings
scored              in T2¹         under LDs²        under SDs²        in T3

cosmid B line 1     3:1            4.6 +/− 0.4       14.0 +/− 2.5      1:0
cosmid B line 2     3.7:1          4.2 +/− 0.3       18.5 +/− 1.1      1:0
cosmid B line 3     2.9:1          4.6 +/− 0.8       13.5 +/− 4.1      1:0
cosmid B line 4     2.4:1          4.6 +/− 0.8       16.4 +/− 2.2      1:0
cosmid B line 5     3.0:1          5.1 +/− 0.5       18.5 +/− 1.1      1:0
cosmid C line 1     2.9:1          4.6 +/− 0.6       20.6 +/− 3.8      1:0
cosmid C line 2     3.4:1          3.9 +/− 0.4       11.7 +/− 3.2      1:0
cosmid C line 3     3.3:1          4.0 +/− 0.4       20.4 +/− 1.2      1:0
cosmid C line 4     4.9:1          3.7 +/− 0.3       7.6 +/− 5.3³      1:0
cosmid C line 5     3:1            4.9 +/− 0.6       17.7 +/− 2.1      1:0
cosmid C line 6     3.8:1          3.5 +/− 0.5       6.6 +/− 1.4       1:0
Landsberg erecta    —              5.1 +/− 0.8       18.9 +/− 2.4      —
co-2                —              12.4 +/− 1.0      18.1 +/− 3.4      —

Flowering time was measured by counting the number of leaves present at the time that the flower bud appeared in the centre of the rosette (Koornneef et al, 1991; Experimental Procedures).

¹ Over 80 plants were tested in each family, except for cosmid B line 3 in which 35 plants were used.

² 10 plants from each family were tested.

³ The large standard error in this population was due to 2 plants that flowered with 18 leaves, while the other 8 had a leaf number of 5.1 +/− 1 at flowering. Southern analysis of this line using a T-DNA fragment as probe identified 6 hybridising fragments. The variation in flowering time could therefore be due to the segregation of one T-DNA copy that is required for early flowering, or to the occurrence of co-suppression repressing activity of the transgenes in some individuals.

TABLE 2
Flowering time of transgenic wild-type plants carrying extra copies of the CO gene

                                   Average LN at     Average LN at     Ratio of
Landsberg erecta                   flowering of T3   flowering of T3   kanamycin
transgenic          Km             individuals       individuals       resistance
line                in T2¹         under LDs²        under SDs²        in T3

cosmid B line 1     3.4:1          4.4 +/− 1.0       18.1 +/− 2.1      1:0
cosmid B line 2     5.9:1          3.2 +/− 0.6       10.1 +/− 2.2      1:0
cosmid B line 3     2.8:1          4.0 +/− 0.5       19.6 +/− 2.2      1:0
Landsberg erecta                   5.1 +/− 0.8       18.9 +/− 2.4      —
co-2                               12.4 +/− 1.0      18.1 +/− 3.7      —

¹ Over 80 plants were tested in each family.

² 10 plants from each family were tested.

REFERENCES

-   Balcells et al., (1994) The Plant Journal 5, 755-764.
-   Bancroft I, Wolk CP (1988) Nucl. Acids Res. 16:7405-7418
-   Benfey et al., EMBO J. 9: 1677-1684 (1990a).
-   Benfey et al., EMBO J. 9: 1685-1696 (1990b).
-   Becker et al., (1994) The Plant Journal 5, 299-307.
-   Bower, R. and Birch, R. G. (1992) The Plant Journal 2, 409-416.
-   Brendel, V. and Karlin, S. (1989) Proc Natl Acad Sci USA 86, 5698-5702.
-   Cao et al., (1992) Plant Cell Reports 11, 586-591.
-   Chang et al., (1988) Proc Natl Acad Sci USA 85:6856-6860
-   Christou et al., (1991) Bio/Technol. 9, 957-962.
-   Coulson et al., (1988) Nature 335:184-186
-   Dale, P. J. and Irwin, J. A. (1994) in Designer oil crops ed. Murphy D. J. VCH, Weinheim, Germany.
-   Datta et al., (1990) Bio/Technol. 8, 736-740.
-   Dean et al., (1992) in Arabidopsis thaliana. Plant Journal 2, 69-82.
-   Frohman et al., (1988) Proc Natl Acad Sci USA 85, 8998-9002.
-   Gatz et al., (1992) Plant Journal 2, 397-404.
-   Gill, G. and Ptashne, M. (1988) Nature 334, 721-724.
-   Gordon-Kamm et al., Plant Cell 2, 603-618.
-   Heard et al., (1989) Nucl Acids Res 17:5861
-   Jones et al., (1992) Transgenic Research 1, 285-297.
-   Koornneef et al., (1991) Mol Gen Genet 229, 57-66.
-   Koornneef et al., (1983) Heredity 74, 265-272.
-   Koziel et al., (1993) Bio/Technol. 1, 194-200.
-   Lagercrantz, U., Putterill, J., Coupland, G. and Lydiate D. (1995) comparative mapping in Arabidopsis and Brassica, fine scale genome collinearity and congruence of genes controlling flowering time. Plant Journal in press.
-   Lee et al., (1994) The Plant Cell 6, 75-83.
-   Martin, D. I. K. and Orkin, S. H. (1990) Genes and Development 4, 1886-1898.
-   Medford, J. I. (1992) Plant Cell 4, 1029-1039.
-   Medford et al., (1991) Plant Cell 3, 359-370.
-   Moloney et al., (1989) Plant Cell Reports 8, 238-242.
-   Napoli et al., (1990) The Plant Cell 2, 279-289.
-   Olszewski, N. and Ausubel, F. M. (1988) Nucleic Acids Res. 16, 10765-10782.
-   Potrykus (1990) Bio/Technology 8, 535-542.
-   Ptashne, M. and Gann, A. F. (1990) Nature 346, 329-331.
-   Putterill et al., (1993) Mol Gen Genet 239, 145-157.
-   Putterill et al., (1995) Cell 80, 847-857.
-   Radke et al., (1988) Theoretical and Applied Genetics 75, 685-694.
-   Ramain et al., (1993) Development 119, 1277-1291.
-   Redei, G. P. (1962) Genetics 47, 443-460.
-   Relichová J (1976) Arabidopsis Inf Serv 13:25-28
-   Rhodes et al., (1988) Science 240, 204-207.
-   Rothstein et al., (1987) Proc. Natl. Acad. Sci. USA 84, 8439-8443.
-   Sambrook et al., (1989). Molecular Cloning: A Laboratory Manual. (Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory).
-   Sánchez-Garciá, I. and Rabbits, T. H. (1994) Trends in Genetics 10, 315-320.
-   Schmidt R, Dean C (1992) Genome Analysis Vol. 4 Strategies for physical mapping Cold Spring Harbour Laboratory Press 71-98
-   Schmidt R et al., (1992) Aust J Plant Physiol 19: 341-351
-   Sharpe et al., (1995) Frequent non-reciprocal translocation in the amphidiploid genome of oilseed rape (Brassica napus). Genomics in press.
-   Shimamoto et al., (1989) Nature 338, 2734-2736.
-   Smith et al., (1988) Nature 334, 724-726.
-   Somers et al., (1992) Bio/Technol. 10, 1589-1594.
-   Stiekema et al., (1988) Plant Molecular Biology 11, 255-269.
-   Tai T, Tanksley S (1991) Plant Mol Biol Rep. 8:297-303
-   Trainor et al., (1990) Nature 343, 92-96
-   van der Krol et al., (1990) The Plant Cell 2, 291-299.
-   Vasil et al., (1992) Bio/Technol. 10, 667-674.
-   Valvekens et al., (1988) Proc. Natl. Acad. Sci. USA 87, 5536-5540.
-   Weigel et al., (1992) Cell 69, 843-859.
-   Weigel and Nilsson (1995) A development switch sufficient for flower initiation in diverse plants. Nature in press.
-   Zhang et al., (1992) The Plant Cell 4, 1575-1588.

1-33. (canceled)
34. A nucleic acid isolate having a nucleotide sequence coding for a polypeptide which includes the amino acid sequence shown in FIG. 1.
35. Nucleic acid according to claim 34 wherein coding sequence is the coding sequence shown in FIG. 1.
36. Nucleic acid according to claim 34 wherein the coding sequence is a mutant, allele or derivative of the coding sequence shown in FIG. 1.
37. A nucleic acid isolate having a nucleotide sequence coding for a polypeptide which includes a sequence mutant, allele or derivative of the CO amino acid sequence of the species Arabidopsis thaliana shown in FIG. 1 or a homologue from another species, by way of insertion, deletion, addition or substitution of one or more residues, or a said homologue, wherein expression of said nucleotide sequence delays flowering in a transgenic plant, the timing of flowering being substantially unaffected by vernalisation.
38. A nucleic acid isolate having a nucleotide sequence coding for a polypeptide which includes a sequence mutant, allele or derivative of the CO amino acid sequence of the species Arabidopsis thaliana shown in FIG. 1 or a homologue from another species, by way of insertion, deletion, addition or substitution of one or more residues, or a said homologue, wherein expression of said nucleotide sequence promotes flowering in a transgenic plant, the timing of flowering being substantially unaffected by vernalisation.
39. Nucleic acid according to claim 38 able to complement a co mutation.
40. Nucleic acid according to claim 39 wherein said mutation is in Arabidopsis thaliana.
41. Nucleic acid according to claim 37 wherein the encoded polypeptide includes an arrangement of cysteines characteristic of zinc fingers.
42. Nucleic acid according to claim 41 wherein the arrangement of cysteines is C-X₂-C-X₁₆-C-X.
43. Nucleic acid according to claim 37 where the encoded polypeptide includes a zinc finger.
44. Nucleic acid according to claim 38 wherein said homologue has the amino acid sequence shown in FIG. 5 or FIG. 6.
45. Nucleic acid according to claim 44 wherein said coding sequence is the coding sequence shown in FIG. 5 or FIG. 6.
46. Nucleic acid according to claim 34 under the control of a regulatory sequence for expression of said polypeptide.
47. Nucleic acid according to claim 46 wherein said regulatory sequence includes an inducible promoter.
48. Nucleic acid according to claim 47 wherein the promoter is derived from a maize gene for a 27 kD subunit of glutathione-S-transferase, isoform II.
49. A nucleic acid isolate having a nucleotide sequence complementary to a coding sequence of claim 34 or a fragment of a said coding sequence suitable for use in anti-sense regulation of gene expression.
50. Nucleic acid which is DNA according to claim 34 wherein said nucleotide sequence or a fragment thereof is under control of a regulatory sequence for anti-sense transcription of said nucleotide sequence or a fragment thereof.
51. Nucleic acid according to claim 50 comprising an inducible promoter.
52. Nucleic acid according to claim 51 wherein the promoter is derived from a maize gene for a 27 kD subunit of glutathione-S-transferase, isoform II.
53. A nucleic acid vector suitable for transformation of a plant cell and comprising nucleic acid according to claim 34.
54. A plant cell comprising nucleic acid according to claim 34.
55. A plant cell according to claim 54 having heterologous said nucleic acid within its genome.
56. A plant cell according to claim 55 having more than one said nucleotide sequence per haploid genome.
57. A plant comprising plant cell according to claim 54.
58. Selfed or hybrid progeny or a descendant of a plant according to claim 57, or any part or propagule of such a plant, progeny or descendant, such as seed.
59. A method of influencing a flowering characteristic of a plant, the method comprising causing or allowing expression of the polypeptide encoded by nucleic acid according to claim 34 from that nucleic acid within cells of the plant.
60. A method of influencing a flowering characteristic of a plant, the method comprising causing or allowing transcription from nucleic acid according to claim 34 within cells of the plant.
61. A method of influencing a flowering characteristic of a plant, the method comprising causing or allowing anti-sense transcription from nucleic acid according to claim 49 within cells of the plant.
62. A method of identifying and cloning CO homologues from plant species other than Arabidopsis thaliana which method employs a nucleotide sequence derived from that shown in FIG. 1.
63. Nucleic acid encoding a CO homologue obtained by the method of claim 62, CO having the amino acid sequence shown in FIG. 1.
64. Nucleic acid according to claim 63 which comprises a nucleotide sequence shown in FIG. 5 or FIG. 6.
65. A method of identifying and cloning CO homologues from plant species other than Arabidopsis thaliana which method employs a nucleotide sequence derived from a sequence shown in FIG. 5 or FIG. 6.
66. Nucleic acid encoding a CO homologue obtained by the method of claim 65, CO having the amino acid sequence shown in FIG. 1.